DETAILED ACTION
Response to Amendment 

1. Applicant has submitted an amendment filed 3/16/2005, amending claims 19, 25,
31, 39, and 47-49, while arguing to traverse the art rejection based on an amended
limitation regarding "identifying a known speaker from among the plurality of speakers"
(see claim amendment). Applicant's arguments have been fully considered but they are
not persuasive. Kimber et al. (US 5598507) teach the step of identifying a speaker from
among a plurality of speakers by using speech models trained by said plurality of
speakers (col. 7, lines 41-67, recognizing speaker based on "usual speaking style of the
speaker— the actual words used are unimportant"). Accordingly, the previous ground of
rejection is maintained.

Claim Rejections - 35 USC § 102

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - (b) the invention was patented or described in a printed 
publication in this or a foreign country or in public use or on sale in this country, more than one year prior to 
the date of application for patent in the United States. 

3. Claims 25-30, 39-41, 43-47, and 49 are rejected under 35 U.S.C. 102(b) as being
anticipated by Kimber et al. (US Patent No. 5598507). 

4. Regarding claim 39, Kimber et al. disclose an apparatus for processing a 
continuous audio stream containing human speech from a plurality of speakers related 
to at least one particular transaction, comprising: a predeterminer which predetermines
at least one known speaker from among the plurality of speakers (element 212 in figure
12, initial training of speaker models HMMs and/or col. 7, lines 41-67, recognizing 
speaker based on "usual speaking style of the speaker— the actual words used are 
unimportant"); a detector which detects speaker changes in the audio stream (col. 11,
ln. 38-67); a recognizer which recognizes the predetermined speaker in the audio
stream (col. 11, ln. 38 to col. 12, ln. 27); an indexer for indexing the audio stream
dependent on a detected speaker change and a recognized predetermined speaker (fig.
12 or col. 11, ln. 38 to col. 12, ln. 27).

5. Regarding claims 25, 43, and 49, Kimber et al. disclose a method, apparatus, 
and program storage device readable by machine for processing a continuous audio 
stream containing human speech of a plurality of speakers related to at least one 
particular transaction, comprising the steps of: identifying a known speaker from among 
the plurality of speakers (col. 7, lines 41-67, recognizing speaker based on "usual 
speaking style of the speaker— the actual words used are unimportant"); digitizing the
continuous audio stream (ADC is inherently included in a computer system of figure 11);
detecting a speaker change in the digitized audio stream (col. 11, ln. 38-67); performing
a speaker recognition if a speaker change is detected (col. 11, ln. 38 to col. 12, ln. 27);
indexing the audio stream with respect to the detected speaker change if the known
speaker is recognized (fig. 12 or col. 11, ln. 38 to col. 12, ln. 27).

6. Regarding claims 26 and 44, Kimber et al. further disclose a method and
apparatus according to claims 25 and 39, comprising the further step of protocolling
time information for detected speaker changes (col. 11, ln. 21-37).

7. Regarding claims 27 and 40, Kimber et al. further disclose a method and
apparatus according to claims 25 and 31, wherein the step of detecting a speaker
change and/or the step of performing a speaker recognition is/are preceded by the
further step of detecting non-speech boundaries between continuous speech segments
(col. 12, ln. 1-10, specifically elements 212 or 216 in figure 12).

8. Regarding claim 28, Kimber et al. further disclose a method according to claim 
25, wherein the step of detecting a speaker change is accomplished by use of at least 
one characteristic audio feature, in particular features derived from the spectrum of the 
audio signal (col. 12, ln. 1-20, spectral feature vectors to train HMM are derived from
audio signal for comparison with stored models). 

9. Regarding claim 29, Kimber et al. further disclose a method according to claim 
25, wherein the step of performing a speaker recognition involves the particular steps of 
calculating a speaker signature from the audio stream and comparing the calculated 
speaker signature with at least one known speaker signature (col. 5, ln. 10-27, spectral
feature vectors used to train the HMM are speaker signatures).

10. Regarding claim 30, Kimber et al. further disclose a method according to claim
25 for use in a speech recognition or voice control system comprising at least two 
speaker-specific speaker models and/or dictionaries, wherein interchanging between 
the at least two speaker-specific dictionaries dependent on the detected speaker 
change and the corresponding recognized speaker (col. 11, ln. 13 to col. 12, ln. 20 and
figure 9). 

11. Regarding claim 41, Kimber et al. further disclose an apparatus according to
claim 39, further comprising a scanner which automatically scans a continuous audio
record, in particular a continuous audio stream recorded on a data or a signal carrier,
and for detecting speaker changes in the continuous audio record (figure 11 or col. 11,
ln. 13-37).

12. Regarding claim 45, Kimber et al. further disclose an apparatus according to 
claim 39, comprising means for marking at least the beginning of a detected speech 
segment related to a predetermined speaker (col. 11, ln. 21-37).

13. Regarding claim 46, Kimber et al. further disclose an apparatus according to
claim 39, comprising a database, which stores speech signatures for at least two
speakers (the operation of figure 12 stores initial training speaker models).

14. Regarding claim 47, Kimber et al. disclose a speech recognition processing an
incoming audio stream containing human speech from a plurality of speakers and
having at least two speaker models and/or speaker-specific dictionaries, comprising: a
detector which detects a speaker change in the incoming audio stream (col. 11, ln. 38-
67); a gatherer which gathers speaker-specific information with corresponding speaker-
specific information of at least one predetermined known speaker from among the
plurality of speakers thus recognizing the at least one predetermined speaker (col. 5, ln.
10 to col. 10, ln. 67, input audio signal is parameterized into feature vectors for
comparing with the speaker templates and/or col. 7, lines 41-67, recognizing speaker 
based on "usual speaking style of the speaker— the actual words used are 
unimportant"); and an interchanger which interchanges between the at least two 
speaker-specific dictionaries dependent on the detected speaker change and the 
corresponding recognized speaker (figure 12, the system of figure 12 contains a 
number of trained speaker recognized models, each is compared with input models to 
determine a speaker match). 

Claim Rejections - 35 USC § 103 

15. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made.

16. Claims 19-24, 31-38, 42, and 48 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Kimber et al. (US Patent No. 5598507) in view of Glickman et al. (US 
Patent No. 6076059). 

17. Regarding claim 31, Kimber et al. disclose an apparatus for processing a
continuous audio stream containing human speech from a plurality of speakers related 
to at least one particular transaction, comprising: a predeterminer which predetermines 
at least one known speaker from among the plurality of speakers (element 212 in figure 
12, initial training of speaker models HMMs); a detector which detects speaker changes 
in the audio stream (col. 11, ln. 38-67); a recognizer which recognizes the
predetermined speaker in the audio stream (col. 12, ln. 1-27).

Kimber et al. fail to disclose an initiator which initiates transcription of at least part 
of the audio stream in case of a detected speaker change and a recognized 
predetermined known speaker. However, Glickman et al. teach an initiator which 
initiates transcription of at least part of the audio stream in case of a detected speaker 
change and a recognized predetermined known speaker (col. 5, ln. 30-67).

Kimber et al. and Glickman et al. are analogous art because they are from the same
field of endeavor. It would therefore have been obvious to one of ordinary skill in the
art at the time of invention to modify Kimber et al. by incorporating the teaching of
Glickman et al. in order to provide automatic closed-captioning using speaker-dependent
models to enhance speech recognition accuracy.

18. Regarding claims 19, 34, 42, and 48, Kimber et al. disclose a method, apparatus,
and a program storage device readable by machine for processing a continuous audio 
stream containing human speech from a plurality of speakers related to at least one 
particular transaction, comprising the steps of: identifying a known speaker from among 
the plurality of speakers (col. 7, lines 41-67, recognizing speaker based on "usual
speaking style of the speaker— the actual words used are unimportant"); digitizing the 
continuous audio stream (ADC is inherently included in a computer system of figure 11); 
detecting a speaker change in the digitized audio stream (col. 11, ln. 38-67); performing
a speaker recognition if a speaker change is detected (col. 12, ln. 1-27).

Kimber et al. fail to disclose the step of transcribing at least part of the continuous 
audio stream if a predetermined speaker is recognized. However, Glickman et al. teach 
the step of transcribing at least part of the continuous audio stream if the known 
speaker is recognized (col. 5, ln. 30-67).

Kimber et al. and Glickman et al. are analogous art because they are from the same
field of endeavor. It would therefore have been obvious to one of ordinary skill in the
art at the time of invention to modify Kimber et al. by incorporating the teaching of
Glickman et al. in order to provide automatic closed-captioning using speaker-dependent
models to enhance speech recognition accuracy.

19. Regarding claim 35, Kimber et al. disclose an apparatus for processing a 
continuous audio stream containing human speech related to at least one particular 
transaction, comprising the steps of: digitizing the continuous audio stream (ADC is 
inherently included in a computer system of figure 11); detecting a speaker change in
the digitized audio stream (col. 11, ln. 38-67); performing a speaker recognition if a
speaker change is detected (col. 11, ln. 38 to col. 12, ln. 27); indexing the audio stream
with respect to the detected speaker change if a predetermined speaker is recognized
(fig. 12 or col. 11, ln. 38 to col. 12, ln. 27).

20. Regarding claims 20 and 36, Kimber et al. further disclose a method and 
apparatus according to claims 19 and 31, comprising the further step of protocolling 
time information for detected speaker changes (col. 11, ln. 21-37).

21. Regarding claims 21 and 32, Kimber et al. further disclose a method and
apparatus according to claims 19 and 39, wherein the step of detecting a speaker 
change and/or the step of performing a speaker recognition is/are preceded by the 
further step of detecting non-speech boundaries between continuous speech segments 
(col. 12, ln. 1-10, specifically elements 212 or 216 in figure 12).

22. Regarding claim 22, Kimber et al. further disclose a method according to claim 
19, wherein the step of detecting a speaker change is accomplished by use of at least 
one characteristic audio feature, in particular features derived from the spectrum of the 
audio signal (col. 12, ln. 1-20, spectral feature vectors to train HMM are derived from
audio signal for comparison with stored models).

23. Regarding claim 23, Kimber et al. further disclose a method according to
claim 19, wherein the step of performing a speaker recognition involves the particular
steps of calculating a speaker signature from the audio stream and comparing the
calculated speaker signature with at least one known speaker signature (col. 5, ln. 10-
27, spectral feature vectors used to train the HMM are speaker signatures).

24. Regarding claim 24, Kimber et al. further disclose a method according to claim 
19 for use in a speech recognition or voice control system comprising at least two 
speaker-specific speaker models and/or dictionaries, wherein interchanging between 
the at least two speaker-specific dictionaries dependent on the detected speaker 
change and the corresponding recognized speaker (col. 11, ln. 13 to col. 12, ln. 20 and
figure 9). 

25. Regarding claim 33, Kimber et al. further disclose an apparatus according to 
claim 31, further comprising a scanner which automatically scans a continuous audio 
record, in particular a continuous audio stream recorded on a data or a signal carrier, 
and for detecting speaker changes in the continuous audio record (figure 11 or col. 11, 
ln. 13-37).

26. Regarding claim 37, Kimber et al. further disclose an apparatus according to 
claim 31, comprising means for marking at least the beginning of a detected speech
segment related to a predetermined speaker (col. 11, ln. 21-37).

27. Regarding claim 38, Kimber et al. further disclose an apparatus according to
claim 31, comprising a database, which stores speech signatures for at least two
speakers (the operation of figure 12 stores initial training speaker models). 

Conclusion 

The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Kuhn et al. (US 6141644) is considered pertinent to the claimed 
invention. 

Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Huyen Vo whose telephone number is 703-305-8665. 
The examiner can normally be reached on M-F, 9-5:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Doris To, can be reached on 703-305-4827. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

HXV 5/23/2005 